A Two Stage Mask Estimation Approach to Robust Speaker Verification

Authors

Yali Zhao

Lei Xie

Zhonghua Fu

Abstract

We propose a two-stage mask estimation approach to robust speaker verification (SV) in noisy environments. We consider a practical semi-blind SV scenario: the location of the target speaker is fixed, while the locations of all interferers are unknown. In the first stage, we use a dual-microphone setup and a semi-blind degenerate unmixing estimation technique (DUET) to estimate an initial binary mask. In the second stage, we refine the mask based on the time and frequency histograms of the initial mask. As a result, only highly reliable time-frequency components in the spectro-temporal features are kept for downstream verification. Experiments show that the proposed approach is superior to a baseline MFCC approach and a recent local-SNR-based mask estimation approach.
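The abstract gives no implementation details, so the following is only a minimal sketch of what a two-stage mask of this kind could look like, not the authors' method. It assumes two-channel STFT matrices of shape (frequency bins, frames), a known inter-microphone delay for the fixed target, and hypothetical thresholds (`tol`, `min_frame_frac`, `min_freq_frac`) chosen purely for illustration. Stage 1 keeps time-frequency bins whose estimated inter-channel delay matches the target, in the spirit of DUET; stage 2 uses the per-frame and per-frequency occupancy of that initial mask as simple histograms and discards sparsely populated frames and bins.

```python
import numpy as np

def initial_mask(X1, X2, omega, target_delay=0.0, tol=0.5):
    """Stage 1 (DUET-style sketch): keep bins whose inter-channel delay
    is close to the delay expected for the fixed target speaker.

    X1, X2: complex STFTs, shape (n_freq, n_frames).
    omega: angular frequency of each bin in rad/sample, all > 0 (assumption).
    target_delay, tol: hypothetical parameters, in samples.
    """
    # Phase of X2 relative to X1; equivalent to angle(X2 / X1) without dividing.
    phase_diff = np.angle(X2 * np.conj(X1))
    # A pure delay d gives a phase difference of -omega * d.
    delay = -phase_diff / omega[:, None]
    return np.abs(delay - target_delay) < tol

def refine_mask(mask, min_frame_frac=0.2, min_freq_frac=0.2):
    """Stage 2 (sketch): drop frames and frequency bins that the initial
    mask rarely marks as reliable. Thresholds are illustrative assumptions."""
    time_hist = mask.mean(axis=0)  # fraction of reliable bins in each frame
    freq_hist = mask.mean(axis=1)  # fraction of reliable frames in each bin
    keep = np.outer(freq_hist >= min_freq_frac, time_hist >= min_frame_frac)
    return mask & keep
```

In such a sketch, the refined mask would then select which spectro-temporal feature components are passed to the verification back end; how the paper actually builds and thresholds its histograms is not stated in the abstract.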

Related resources

Using Exciting and Spectral Envelope Information and Matrix Quantization for Improvement of the Speaker Verification Systems

Speaker verification from a few spoken words of a sentence has many applications. Many methods, such as DTW, HMM, VQ, and MQ, can be used for speaker verification. We applied MQ for its precise, reliable, and robust performance and its computational simplicity. We also used pitch frequency and the log-gain contour to further improve system performance.

Discriminative adaptation for speaker verification

Speaker verification is a binary classification task to determine whether a claimed speaker uttered a phrase. Current approaches to speaker verification tasks typically involve adapting a general speaker Universal Background Model (UBM), normally a Gaussian Mixture Model (GMM), to model a particular speaker. Verification is then performed by comparing the likelihoods from the speaker model to t...

Extended weighted linear prediction (XLP) analysis of speech and its application to speaker verification in adverse conditions

This paper introduces a generalized formulation of linear prediction (LP), including both conventional and temporally weighted LP analysis methods as special cases. The temporally weighted methods have recently been successfully applied to noise robust spectrum analysis in speech and speaker recognition applications. In comparison to those earlier methods, the new generalized approach allows mo...

Robust methods of updating model and a priori threshold in speaker verification

We describe a method of updating a hidden Markov model (HMM) for speaker verification using a small amount of new data for each speaker. The HMM is updated by adapting the model parameters to the new data by maximum a posteriori (MAP) estimation. The initial values of the a priori parameters in MAP estimation are set using training speech used for first creating a speaker HMM. We also present a...

Publication date: 2012
